Bayesian Melding
Background for the Approach
The Bayesian melding approach was proposed by Poole et al. [@poole2000].
This approach accounts for uncertainty in both the inputs and outputs of a deterministic model. The initial motivation for the approach was the study of the population dynamics of whales in the presence of substantial uncertainty around model inputs for population growth [@poole2000]. However, the framework provided by Poole et al. can be applied in any circumstance where we have uncertainty around quantities \(\theta\) and \(\phi\) related by a deterministic function \(M:\theta \to\phi\). Because the deterministic model \(M\) can take on a wide range of forms, the approach has since been applied in various fields, including urban simulation [@sevcikova2007], ecology [@robson2014], and infectious disease [@powers2011].
Let \(M: \theta \to \phi\) be the deterministic model defined by the function relating a vector of input parameters \(\theta\) to an output vector \(\phi\), and suppose we have a prior on \(\theta\) denoted \(f_\theta(\theta)\) and a prior on \(\phi\) denoted \(f_\phi^{direct}(\phi)\).
Note, however, that we actually have two distinct priors on \(\phi\). In addition to \(f_\phi^{direct}(\phi)\), there is the prior induced on \(\phi\) by the prior for \(\theta\) and the function \(M\), which we denote \(f_\phi^{induced}(\phi)\). Generally, these priors are based on different sources of information.
If \(M^{-1}\) exists, we can write this induced prior \(f_\phi^{induced}(\phi) = f_\theta(M^{-1}(\phi)) |J(\phi)|\). This result follows from the fact \(M(\theta) = \phi\), so we apply a change of variables to obtain the distribution of \(\phi\) from the distribution of \(M(\theta)\).
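For intuition, consider a toy example (not from [@poole2000]): \(\theta \sim Exp(1)\) with \(M(\theta) = \theta^2\), which is invertible on the positive reals. The change-of-variables density can then be checked against a kernel density estimate of simulated values of \(\phi\):

```r
set.seed(42)
# toy example: theta ~ Exp(1), M(theta) = theta^2, invertible on (0, Inf)
theta <- rexp(1e5, rate = 1)
phi   <- theta^2
# change of variables: f_phi(phi) = f_theta(sqrt(phi)) * |d/dphi sqrt(phi)|
f_induced <- function(p) dexp(sqrt(p), rate = 1) / (2 * sqrt(p))
# compare the analytic induced density with a KDE of the simulated phi at phi = 1
kde <- density(phi, n = 2^12)
kde_at_1 <- approx(kde$x, kde$y, xout = 1)$y
c(analytic = f_induced(1), kde = kde_at_1)  # both near exp(-1)/2 = 0.184
```

The simulated and analytic densities agree closely, which is the pattern the change-of-variables result guarantees whenever \(M^{-1}\) exists.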
In practice, \(M^{-1}\) rarely exists since \(\theta\) is often of higher dimensionality than \(\phi\), in which case \(M\) is not invertible. This means we generally approximate \(f_\phi^{induced}(\phi)\) without obtaining its analytical form.
In addition to this induced prior, we have the prior \(f_\phi^{direct}(\phi)\), which involves neither \(M\) nor the inputs \(\theta\). Since these priors are based on different sources of information and may reflect different uncertainties, it is often useful to use both sources of information to inform our estimates. To do so, we need to combine \(f_\phi^{induced}(\phi)\) and \(f_\phi^{direct}(\phi)\) into a pooled distribution.
Multiple strategies exist for pooling distinct distributions, but one requirement for a Bayesian analysis is that the result should not depend on the order of updating and pooling. That is, updating the individual priors using Bayes' theorem and then pooling them should yield the same result as pooling first and then updating the pooled distribution; pooling methods with this property are deemed externally Bayesian. Logarithmic pooling has been shown to be externally Bayesian under conditions likely to hold in most settings; in fact, it has been shown to be the only pooling method with this property [@genest1986]. For this reason, Poole et al. recommend logarithmic pooling for Bayesian melding.
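To see the externally Bayesian property concretely, here is a small numeric check on a toy discrete example (illustrative numbers, not from the references): because the likelihood factors out of the geometric mean, pooling and updating commute.

```r
# toy discrete priors on three states and an arbitrary likelihood
p   <- c(0.3, 0.2, 0.5)
q   <- c(0.4, 0.4, 0.2)
lik <- c(0.7, 0.2, 0.1)
normalize    <- function(v) v / sum(v)
log_pool     <- function(a, b, alpha = 0.5) normalize(a^alpha * b^(1 - alpha))
bayes_update <- function(prior) normalize(prior * lik)
# pool then update vs. update then pool
pool_then_update <- bayes_update(log_pool(p, q))
update_then_pool <- log_pool(bayes_update(p), bayes_update(q))
all.equal(pool_then_update, update_then_pool)  # TRUE
```

Linear (arithmetic) pooling, by contrast, does not commute with Bayesian updating in general.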
The logarithmically pooled prior for \(\phi\), formed by pooling \(f_\phi^{induced}(\phi)\) and \(f_\phi^{direct}(\phi)\), is proportional to
\[(f_\phi^{induced}(\phi))^{\alpha} (f_\phi^{direct}(\phi))^{1-\alpha}\]
where \(\alpha \in [0,1]\) is a pooling weight. Commonly, a choice of \(\alpha = 0.5\) is used to give the priors equal weight. In this case, logarithmic pooling may be referred to as geometric pooling since it is equivalent to taking a geometric mean.
If \(M\) is invertible, we can obtain the constrained distributions for the model inputs by simply inverting \(M\). However, \(M\) is rarely invertible, so we must consider how to proceed in the noninvertible case.
To build intuition for the strategy Poole et al. recommend, consider a mapping \(M: \theta \to \phi\) for \(\theta \in \mathbb{R}\) and \(\phi \in \mathbb{R}\) defined by the table below, where each row pairs a value of \(\theta\) with its image \(\phi = M(\theta)\). The particular choices of \(f_\theta\) and \(f_\phi^{direct}\) do not matter here, as long as they are valid densities.
| \(\theta\) | \(f_\theta(\theta)\) | \(\phi\) | \(f_\phi^{direct}(\phi)\) |
|---|---|---|---|
| 1 | 0.3 | 1 | 0.4 |
| 2 | 0.2 | 2 | 0.6 |
| 3 | 0.5 | 2 | 0.6 |
We see that \(M\) is not invertible since \(\theta=2\) and \(\theta = 3\) both map to \(\phi=2\), which implies the inverse \(M^{-1}\) is not well defined.
We can compute the density \(f_\phi^{induced}(\phi)\) using our function \(M\) by taking \(f_\phi^{induced}(\phi) = \sum_{\theta \in M^{-1}(\phi)} f_\theta(\theta)\), summing the probabilities of all values of \(\theta\) that map to \(\phi\); in the continuous case we would instead multiply by the Jacobian term \(|J(\phi)|\), but this is not needed in the discrete case [@blitzsteinIntroductionProbability2019].
So we have \(f_\phi^{induced}(1) = f_\theta(1) = 0.3\) since \(M(1)=1\), and \(f_\phi^{induced}(2) = f_\theta(2) + f_\theta(3) = 0.2 + 0.5=0.7\) since \(M(2) = 2\) and \(M(3)=2\).
Then, we can compute the logarithmically pooled prior with \(\alpha=0.5\) by taking \(f_\phi^{induced}(\phi)^{\alpha} f_\phi^{direct}(\phi)^{1-\alpha}\).
For \(\phi = 1\), we have \(f_\phi^{induced}(\phi)^{\alpha} f_\phi^{direct}(\phi)^{1-\alpha} = (0.3)^{0.5}(0.4)^{0.5} = 0.3464\). For \(\phi = 2\), we have \(f_\phi^{induced}(\phi)^{\alpha} f_\phi^{direct}(\phi)^{1-\alpha} = (0.6)^{0.5}(0.7)^{0.5} = 0.6481\).
To make this a valid density, however, these probabilities must sum to 1, so we renormalize by dividing by (0.3464 + 0.6481). Denoting the pooled prior in phi-space as \(f_\phi^{pooled}(\phi)\), this gives us \(0.3464 / (0.3464 + 0.6481) = 0.3483\) for \(\phi =1\) and \(0.6481 / (0.3464 + 0.6481) =0.6517\) for \(\phi=2\).
Summarizing these results, we have
| \(\phi\) | \(f_\phi^{direct}(\phi)\) | \(f_\phi^{induced}(\phi)\) | \(f_\phi^{pooled}(\phi)\) |
|---|---|---|---|
| 1 | 0.4 | 0.3 | 0.3483 |
| 2 | 0.6 | 0.7 | 0.6517 |
However, we want the pooled prior on the inputs \(\theta\), that is, \(f_\theta^{pooled}(\theta)\).
Poole et al. reasoned as follows. Since \(M\) uniquely maps \(\theta=1\) to \(\phi =1\), the probability that \(\theta=1\) should be equal to the probability \(\phi = 1\). That is, we should have \(f_\theta^{pooled}(1) = f_\phi^{pooled}(1)\).
However, the relationship of \(\theta=2\) or \(\theta=3\) to \(\phi\) is not one to one. Since \(M(2)=2\) and \(M(3)=2\), the sum of the probabilities for \(\theta=2\) and \(\theta=3\) should equal that for \(\phi=2\), that is, \(f_\theta^{pooled}(2) + f_\theta^{pooled}(3) = f_\phi^{pooled}(2) = 0.6517\).
The challenge is how to divide the probability \(f_\phi^{pooled}(2)\) between \(f_\theta^{pooled}(2)\) and \(f_\theta^{pooled}(3)\). The prior for \(\phi\) yields no information to assist in this choice, because knowing which value \(\phi\) takes on tells us nothing about whether \(\theta=2\) or \(\theta=3\). Thus, the information we have about \(\theta\) must come from \(f_\theta(\theta)\).
That is, we can assign a probability for \(f_\theta^{pooled}(2)\) by considering the probability that \(\theta = 2\) relative to the probability \(\theta =3\), computing
\[f_\theta^{pooled}(2) = f_\phi^{pooled}(2) \Big( \frac{f_\theta(2)}{f_\theta(2) + f_\theta(3)}\Big).\]
That is, if the prior on \(\theta\), \(f_\theta(\theta)\), tells us that \(\theta = 2\) is less probable than \(\theta = 3\), then the pooled prior on \(\theta\), \(f_\theta^{pooled}(\theta)\), should reflect this.
Using this reasoning, we have \(f_\theta^{pooled}(2) = (0.6517) \frac{0.2}{0.2+0.5} = 0.1862\) and \(f_\theta^{pooled}(3) = (0.6517) \frac{0.5}{0.2+0.5} = 0.4655\).
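The worked example above can be reproduced in a few lines of R (numbers match the text to four decimal places):

```r
# discrete worked example: theta in {1,2,3}, M(1)=1, M(2)=M(3)=2
f_theta       <- c(`1` = 0.3, `2` = 0.2, `3` = 0.5)
f_phi_direct  <- c(`1` = 0.4, `2` = 0.6)
M             <- c(`1` = 1, `2` = 2, `3` = 2)
# induced prior on phi: sum the theta mass mapping to each value of phi
f_phi_induced <- tapply(f_theta, M, sum)             # 0.3, 0.7
# logarithmic pooling with alpha = 0.5, then renormalize
pooled_unnorm <- sqrt(f_phi_induced * f_phi_direct)
f_phi_pooled  <- pooled_unnorm / sum(pooled_unnorm)  # 0.3483, 0.6517
# distribute the pooled phi mass back to theta in proportion to f_theta
f_theta_pooled <- f_phi_pooled[as.character(M)] *
  f_theta / f_phi_induced[as.character(M)]
round(f_theta_pooled, 4)  # 0.3483 0.1862 0.4655
```

Note the pooled input prior still sums to one, since each value of \(\phi\) passes its full pooled mass back to the values of \(\theta\) that map to it.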
The reasoning in this simple example, using \(f_\theta(\theta)\) to distribute the probability of a value of \(\phi\) among the multiple values of \(\theta\) that map to it, can be generalized to formulas for computing \(f_\theta^{pooled}(\theta)\) in both the discrete and continuous cases [@poole2000].
General Solution for the Discrete Case
Denote the possible values of \(\theta\) as \(A_1, A_2, \dots\), the possible values of \(\phi\) as \(B_1, B_2, \dots\), and a mapping \(m: \mathbb{N} \to \mathbb{N}\) such that \(M(A_i) = B_{m(i)}\) and \(C_j = M^{-1}(B_j) = \{A_i : M(A_i) = B_j\}\). Then
\[f_\theta^{pooled}(A_i) = f_\phi^{pooled}(B_{m(i)}) \left( \frac{f_\theta(A_i)}{f_\phi^{induced}(B_{m(i)})} \right).\]
General Solution for the Continuous Case
We denote \(B = M(A) = \{M(\theta) : \theta \in A \}\) and \(C = M^{-1}(B) = \{\theta: M(\theta) \in B \}\).
Then
\[ f_\theta^{pooled}(\theta) = f_\phi^{pooled} (M(\theta)) \left( \frac{f_\theta(\theta)}{f_\phi^{induced}(M(\theta))} \right) \tag{1} \]
\[ =k_{\alpha} f_\theta(\theta) \left( \frac{f_\phi^{direct}(M(\theta))}{f_\phi^{induced}(M(\theta))} \right)^{1-\alpha} \tag{2} \] where \(k_\alpha\) is a renormalizing constant for the choice of \(\alpha\). To get from (1) to (2), substitute the logarithmically pooled prior \(f_\phi^{pooled}(\phi) = k_\alpha (f_\phi^{induced}(\phi))^{\alpha} (f_\phi^{direct}(\phi))^{1-\alpha}\) evaluated at \(\phi = M(\theta)\): the factor \((f_\phi^{induced}(M(\theta)))^{\alpha}\) combines with the denominator \(f_\phi^{induced}(M(\theta))\) to give \((f_\phi^{induced}(M(\theta)))^{-(1-\alpha)}\), yielding (2).
Implementation through the Sampling-Importance-Resampling Algorithm
The steps are:

1. Draw values \(\theta_1, \dots, \theta_n\) from the prior distribution \(f_\theta(\theta)\), where \(\theta\) can be multidimensional.
2. For every \(\theta_i\), compute \(\phi_i = M(\theta_i)\).
3. Since \(f_\phi^{induced}(\phi)\) is unlikely to have an analytical form, approximate it by estimating the density of the computed values \(M(\theta_i)\), for example with kernel density estimation.
4. Construct weights proportional to the ratio of the prior on \(\phi\) evaluated at \(M(\theta_i)\) to the induced prior \(f_\phi^{induced}\) evaluated at \(M(\theta_i)\). This applies the same logic as the ratio \(\dfrac{f_\theta(\theta)}{f_\phi^{induced}(M(\theta))}\) from the discrete example, but represents these probabilities in \(\phi\) space. That is, we let \(w_i = \left( \frac{f_\phi^{direct}(M(\theta_i))}{f_\phi^{induced}(M(\theta_i))} \right)^{1-\alpha} L_1(\theta_i) L_2(M(\theta_i))\), where \(L_1\) and \(L_2\) denote likelihoods for data on the inputs and output, when available. In the implementation of probabilistic bias analysis by Wu et al., the likelihoods are not used, and the weights are simply taken to be \[w_i = \left( \frac{f_\phi^{direct}(M(\theta_i))}{f_\phi^{induced}(M(\theta_i))} \right)^{1-\alpha}.\]
5. Resample values from step (1) with probabilities proportional to the weights from step (4).
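A minimal sketch of these steps on a toy model (\(M(\theta) = \theta^2\) with illustrative Beta priors, not the distributions used in this work; the likelihood terms are omitted, as in the Wu et al. implementation):

```r
set.seed(1)
n <- 1e5
alpha_pool <- 0.5  # pooling weight
# (1) draw theta from its prior (toy choice: Beta(2, 2))
theta <- rbeta(n, 2, 2)
# (2) push each draw through the deterministic model (toy choice: M(theta) = theta^2)
phi <- theta^2
# (3) approximate the induced prior on phi with a kernel density estimate
kde <- density(phi, n = 2^12)
f_induced_at_phi <- approx(kde$x, kde$y, xout = phi)$y
# direct prior on phi (toy choice: Beta(1, 9), favoring small phi)
f_direct_at_phi <- dbeta(phi, 1, 9)
# (4) importance weights, omitting the likelihood terms
w <- (f_direct_at_phi / f_induced_at_phi)^(1 - alpha_pool)
# (5) resample theta with probability proportional to the weights
melded_theta <- theta[sample.int(n, size = n, replace = TRUE, prob = w)]
```

Because the toy direct prior favors small \(\phi\), the resampled (melded) values of \(\theta\) shift toward smaller values than the original prior draws.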
Bayesian Melding Applied to COVID-19 Misclassification
In this work, we can relate the inputs \(\theta = \{P(S_1|untested), \alpha, \beta \}\) and \(\phi = P(S_0|test +,untested)\) by the deterministic model \(M: \theta \to \phi\) given by \[P(S_0|test+, untested) = \frac{\beta(1 - P(S_1|untested))}{\beta(1-P(S_1|untested)) + \alpha P(S_1|untested)}.\] The derivation of \(M\) is in the following section.
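As a quick sanity check, we can evaluate \(M\) at approximate central values of the input distributions defined below (the specific numbers here are illustrative):

```r
# M: phi = beta (1 - p) / (beta (1 - p) + alpha p), where p = P(S_1|untested)
M <- function(p, alpha, beta) {
  beta * (1 - p) / (beta * (1 - p) + alpha * p)
}
M(p = 0.025, alpha = 0.9, beta = 0.15)  # 0.8667 (to four decimal places)
```

Note that this value lies well above the support of the direct prior on the asymptomatic rate defined below (truncated to [0.25, 0.7]); this kind of disagreement between the induced and direct priors is precisely what the pooling step reconciles.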
Now, we have two distributions on \(\phi\): the distribution based on data on the asymptomatic rate of infection of COVID-19, and the distribution formed by taking \(M(\theta)\) where \(\theta\) represents the values from the defined distributions of \(\alpha,\beta,\) and \(P(S_1|untested)\). With Bayesian melding, we pool these distributions using logarithmic pooling, and then implement the sampling-importance-resampling algorithm to obtain constrained distributions of the inputs \(\theta\) that are in accordance with information about the asymptomatic rate of the virus.
Due to the uncertainty around our definitions of \(\alpha\) and \(\beta\), it is particularly useful to leverage the information we have about the asymptomatic rate of the virus, \(P(S_0|test +,untested)\), because a large collection of studies has been published in this area. In a meta-analysis pooling data from 95 studies, the pooled estimate of the percentage of the confirmed population that was asymptomatic was 40.50% [95% CI: 33.50%-47.50%] [@ma2021]. Another meta-analysis including 350 studies estimated the asymptomatic percentage to be 36.9% [95% CI: 31.8%-42.4%] and, when restricting to screening studies, 47.3% [95% CI: 34.0%-61.0%] [@sah2021].
This means we have two priors on the asymptomatic rate \(\phi\): the prior induced by taking \(M(\theta)\) for sampled values of \(\theta\), denoted \(f_\phi^{induced}(\phi)\) in the previous section, and the prior based on data about the asymptomatic rate, \(f_\phi^{direct}(\phi)\).
Distribution of \(\theta = \{\alpha, \beta, P(S_1|untested) \}\)
First, sampling the values of \(\theta\), we have:
library(latex2exp)
library(tidyverse) # tibble, pivot_longer, ggplot2, %>%
############################################################
# BETA SHAPE PARAMETERS WITH DESIRED MEAN AND SD
############################################################
get_beta_params <- function(mu, sd) {
  var <- sd^2
  alpha <- ((1 - mu) / var - 1 / mu) * mu^2
  beta <- alpha * (1 / mu - 1)
  list(alpha = alpha, beta = beta)
}

##############################
# DISTRIBUTION OF ALPHA
##############################
# truncated support between a=0.80, b=1
dtrunc_alpha <- function(x) {
shape_params <- get_beta_params(
mu = 0.9,
sd = 0.04)
truncdist::dtrunc(x,spec = "beta",
a = 0.80,
b = 1,
shape1 = shape_params$alpha,
shape2 = shape_params$beta)
}
rtrunc_alpha <- function(n) {
  shape_params <- get_beta_params(
    mu = 0.9,
    sd = 0.04)
  truncdist::rtrunc(n, spec = "beta",
                    a = 0.80,
                    b = 1,
                    shape1 = shape_params$alpha,
                    shape2 = shape_params$beta)
}
##############################
# DISTRIBUTION OF BETA
##############################
# truncated support between a=0.002, b=0.4
dtrunc_beta <- function(x) {
shape_params <- get_beta_params(
mu = 0.15,
sd = 0.09)
truncdist::dtrunc(x,spec = "beta",
a = 0.002,
b = 0.4,
shape1 = shape_params$alpha,
shape2 = shape_params$beta)
}
rtrunc_beta <- function(n) {
  shape_params <- get_beta_params(
    mu = 0.15,
    sd = 0.09)
  truncdist::rtrunc(n,
                    spec = "beta",
                    a = 0.002,
                    b = 0.4,
                    shape1 = shape_params$alpha,
                    shape2 = shape_params$beta)
}
#############################################
# DISTRIBUTION OF P(S1|untested)
#############################################
# DENSITY
dtrunc_s1_untested <- function(x) {
shape_params <- get_beta_params(
mu = 0.025,
sd = (0.15)^2)
# truncated support between a=0, b=0.15
truncdist::dtrunc(x,spec = "beta",
a = 0,
b = 0.15,
shape1 = shape_params$alpha,
shape2 = shape_params$beta)
}
# SAMPLE FROM DISTRIBUTION
rtrunc_s1_untested <- function(n) {
  shape_params <- get_beta_params(
    mu = 0.025,
    sd = (0.15)^2)
  # truncated support between a=0, b=0.15
  truncdist::rtrunc(n, spec = "beta",
                    a = 0,
                    b = 0.15,
                    shape1 = shape_params$alpha,
                    shape2 = shape_params$beta)
}
#####################################
# PLOT THETA ALL TOGETHER
#####################################
tibble(x = seq(0, 1, by = 10^(-4)),
"$\\alpha$" = dtrunc_alpha(x),
"$\\beta$" = dtrunc_beta(x),
"$P(S_1|untested)$" = dtrunc_s1_untested(x)) %>%
pivot_longer(-c(x)) %>%
ggplot(aes(x=x, y = value)) +
geom_area(alpha = .8) +
theme_bw() +
theme(plot.title = element_text(size = 25,
face="bold",
hjust = .5),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
strip.text = element_text(size = 20),
axis.title = element_text(size = 16)
) +
labs(x = "Value",
y = "Density",
title = TeX("Sampling from $\\theta = \\alpha,\\beta, P(S_1|untested)$")
) +
facet_wrap(~name,
labeller= as_labeller(TeX,
default = label_parsed)
  )
# ggsave("./img/theta.png", dpi = 800)
Prior and Induced Prior Distributions for \(P(S_0|test+,untested)\)
Then, taking \(M(\theta)\), we can compute the induced distribution \(f_\phi^{induced}(M(\theta))\) and compare it to our prior on \(\phi\) from meta-analyses on the asymptomatic rate (\(f_\phi^{direct}(\phi)\)).
#######################################################
# PRIOR ON ASYMPTOMATIC RATE P(S_0|test+,untested)
#######################################################
# f_\phi^{direct}(\phi)
# DENSITY
dp_s0_pos <- function(x){
shape_params <- get_beta_params(mu = 0.4,
sd = (0.35)^2)
truncdist::dtrunc(x,
spec = "beta",
a = 0.25,
b = 0.7,
shape1 = shape_params$alpha,
shape2 = shape_params$beta)
}
# DRAW SAMPLE
rp_s0_pos <- function(n){
shape_params <- get_beta_params(mu = 0.4, sd = (0.35)^2)
truncdist::rtrunc(n,
spec = "beta",
a = 0.25,
b = 0.7,
shape1 = shape_params$alpha,
shape2 = shape_params$beta)
}
###############################################################
# INDUCED PRIOR ON ASYMPTOMATIC RATE P(S_0|test+,untested)
###############################################################
# f_\phi^{induced}(\theta)
# input sampled values of theta and compute M(\theta)
est_P_A_testpos = function(P_S_untested, alpha, beta){
beta * (1 - P_S_untested) / (( beta * (1 - P_S_untested)) + (alpha * P_S_untested))
}
nsamp <- 1e5
theta <- tibble(alpha = rtrunc_alpha(nsamp),
beta= rtrunc_beta(nsamp),
P_S_untested = rtrunc_s1_untested(nsamp))
theta <- theta %>%
mutate(phi_induced = est_P_A_testpos(P_S_untested = P_S_untested,
alpha = alpha,
beta=beta))
theta %>%
mutate(phi_prior = rp_s0_pos(nsamp)) %>%
pivot_longer(c(phi_prior,
phi_induced), names_to = "density") %>%
mutate(density = ifelse(density == "phi_induced",
"Induced Prior Distribution",
"Original Prior Distribution")) %>%
ggplot(aes(x=value, fill = density)) +
geom_density(alpha = .8) +
theme_bw() +
theme(legend.position = "right",
legend.text = element_text(size = 14),
plot.title = element_text(size = 26),
axis.text.y = element_blank()) +
scale_fill_manual(values = c("#577C9C", "#56BFA8")) +
labs(title = TeX("$\\overset{Distribution \\, on \\,P(S_0|untested, +)}{and \\, Induced \\, Distribution \\, on \\, P(S_0|untested, +)}$",
bold = TRUE),
fill = "",
y = "Density",
x = "Value")Bayesian Melding
Then, we can apply Bayesian melding to generate constrained distributions for the inputs \(\theta\) and output \(\phi\).
We already have sampled values of \(\theta\) from the prior \(f_\theta(\theta)\) (i.e., the defined
distributions for \(\alpha\), \(\beta\), and \(P(S_1|untested)\)) in the following data
frame. The column phi_induced is \(M(\theta)\), representing sampled values
from the induced distribution \(f_\phi^{induced}(\phi)\).
theta
We perform a kernel density estimation to approximate the density of \(f_\phi^{induced}(\phi)\).
The density function returns a list where \(x\) is the coordinates of the points where
the density is estimated, and \(y\) is
the estimated density values.
# theta contains values sampled from alpha, beta, P_S_untested, and M(theta) = phi_induced
# induced phi
phi <- theta$phi_induced
# approximate $f_\phi^{induced}(\phi)$ via a density approximation
phi_induced_density <- density(x = phi, n = nsamp, adjust = 2, kernel = "gaussian")
glimpse(phi_induced_density)
## List of 7
## $ x : num [1:100000] -0.0341 -0.0341 -0.0341 -0.0341 -0.0341 ...
## $ y : num [1:100000] 4.57e-06 4.57e-06 4.58e-06 4.59e-06 4.60e-06 ...
## $ bw : num 0.0233
## $ n : int 100000
## $ call : language density.default(x = phi, adjust = 2, kernel = "gaussian", n = nsamp)
## $ data.name: chr "phi"
## $ has.na : logi FALSE
## - attr(*, "class")= chr "density"
We see that the kernel density estimation estimates the density at coordinates outside the range of the sampled values of \(\phi\).
tibble(`Coordinates Where Density is Estimated` = phi_induced_density$x,
`Sampled values of Phi Induced` = theta$phi_induced) %>%
pivot_longer(cols=everything()) %>%
ggplot(aes(x=value, fill = name)) +
geom_density(alpha = .6) +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 16),
legend.text = element_text(size = 14)) +
labs(fill = "",
title = TeX(paste0("$\\overset{Comparing\\; Coordinates\\; Where",
"\\;Density \\;is\\; Estimated}{ to \\; Sampled\\; Values\\; of\\; \\phi}$")))This section iterates over every value of the samples from the
induced distribution on \(\phi\) and
finds the coordinate \(x\) closest to
that value of \(\phi\) and stores the
density value at this coordinate. This means the vector
phi_sampled_density represents the densities at the
coordinates closest to the sampled values of \(\phi\) from the induced prior. We verify
this by looking at the extracted coordinates and the sampled values of
\(\phi\).
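As an aside, the per-element search below can be replaced by a single vectorized call to approx, which linearly interpolates the KDE at the sampled values. The sketch below uses simulated stand-in data; interpolation and the first-coordinate-above rule differ slightly at the grid scale, so the results agree only approximately.

```r
set.seed(1)
# simulated stand-in for the sampled phi values and their KDE
phi_sim <- rbeta(1e4, 2, 5)
kde <- density(phi_sim, n = 2^12)
# vectorized linear interpolation of the KDE at each sampled value
dens_interp <- approx(kde$x, kde$y, xout = phi_sim)$y
# the first-coordinate-above rule, for comparison
dens_step <- kde$y[vapply(phi_sim,
                          function(p) which(kde$x > p)[1],
                          integer(1))]
all.equal(dens_interp, dens_step, tolerance = 1e-2)  # TRUE (approximately equal)
```

The interpolation version avoids the per-element which() scan, so it is typically much faster for large samples.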
library(tictoc)
library(future)
##############
# ORIGINAL
##############
tic()
phi_sampled_density <- unlist(parallel::mclapply(
X = phi,
FUN = function(p){
phi_induced_density$y[which(phi_induced_density$x > p)[1]]
}))
toc()
## 43.69 sec elapsed
############################
# MAP IMPLEMENTATION
############################
library(future)
library(tictoc)
future::plan(multisession, workers = 3)
tic()
indexes <- furrr::future_map(phi, ~{
which(phi_induced_density$x > .x)[1]
}) %>%
unlist()
toc()
## 33.68 sec elapsed
# check that map method is equivalent
all.equal(phi_sampled_density, phi_induced_density$y[indexes])
## [1] TRUE
tibble(closest_coords =phi_induced_density$x[indexes],
sampled_induced = theta$phi_induced) %>%
pivot_longer(cols=everything()) %>%
ggplot(aes(x=value, fill = name)) +
geom_density(alpha = .5) +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 16),
legend.text = element_text(size = 14)) +
labs(fill = "",
title = TeX(paste0("$\\overset{Comparing\\; Coordinates\\; Where",
"\\;Density \\;is\\; Estimated}{ to \\; Sampled\\; Values\\; of\\; \\phi}$"))) +
scale_fill_manual(labels = c(
closest_coords = TeX("Closest Coordinates to Sampled Values of $\\phi$"),
sampled_induced = TeX("Sampled values of Induced Distribution for $\\phi$")),
values = c("blue", "yellow"))# not equal, but very close
all.equal(phi_induced_density$x[indexes],theta$phi_induced)
## [1] "Mean relative difference: 6.589721e-06"
phi_sampled_density <- phi_induced_density$y[indexes]
Now, as described in the section regarding the weights of the sampling-importance-resampling algorithm, the weights are \(w_i = \left( \frac{f_\phi^{direct}(M(\theta_i))}{f_\phi^{induced}(M(\theta_i))} \right)^{1-\alpha}\).
However, the code below computes the reciprocal, \(w_i = \left( \frac{f_\phi^{induced}(M(\theta_i))}{f_\phi^{direct}(M(\theta_i))} \right)^{1-\alpha}\), and then accounts for this with the argument prob=1/weights when sampling with these weights.
Here, phi_sampled_density represents \(f_\phi^{induced}(\phi)\) evaluated (approximately) at the sampled values of \(\phi\), the same points at which the density \(f_\phi^{direct}\) is evaluated. This means that applying the function (phi_sampled_density_i/ dp_s0_pos(phi_i))^(1-alpha) to each component of the phi vector computes the induced density \(f_\phi^{induced}(\phi)\) (approximately) at each value of phi and divides it by the prior density \(f_\phi^{direct}(\phi)\) evaluated at the same value of phi.
dp_s0_pos <- function(x){
truncdist::dtrunc(x = x,spec = "beta",a = 0.25,b = 0.7,
shape1 = get_beta_params(mu = 0.4, sd = (0.35)^2)$alpha,
shape2 = get_beta_params(mu = 0.4, sd = (0.35)^2)$beta)
}
rp_s0_pos <- function(n){
  truncdist::rtrunc(n, spec = "beta", a = 0.25, b = 0.7,
                    shape1 = get_beta_params(mu = 0.4, sd = (0.35)^2)$alpha,
                    shape2 = get_beta_params(mu = 0.4, sd = (0.35)^2)$beta)
}
# dp_s0_pos = P(A|test+, untested)
################
# ORIGINAL
################
weights <- parallel::mcmapply(
FUN = function(p,
phi_sampled_density,
alpha){
(phi_sampled_density/ dp_s0_pos(p))^(1-alpha)
},
p=phi,
phi_sampled_density=phi_sampled_density,
MoreArgs = list(alpha=0.5))
################
# REWRITTEN
################
weights2 <- map2_dbl(
phi_sampled_density,
phi,
function(phi_sampled_density_i, phi_i) {
# pooling weight
alpha = .5
(phi_sampled_density_i/ dp_s0_pos(phi_i))^(1-alpha)
}
)
weights3 <- (phi_sampled_density/ dp_s0_pos(phi))^(.5)
all.equal(weights2,weights3)
## [1] TRUE
# check equivalence
all.equal(weights,weights2)
## [1] TRUE
Once we have these weights, we resample the posterior.
nsamp <- 1e5
# resample the posterior
nsamp_post <- 1e5 # number of samples from the posterior
post_samp_ind <-sample.int(n=nsamp,
size=nsamp_post,
prob=1/weights,
replace=T)
pi_samp <- cbind(theta[post_samp_ind,],
P_A_testpos = phi[post_samp_ind]) %>%
select(-phi_induced)
pi_samp_long <- pi_samp %>%
pivot_longer(cols=everything()) %>%
mutate(type = "After Melding")
compare_melded <- theta %>%
mutate(P_A_testpos = rp_s0_pos(nsamp)) %>%
pivot_longer(cols=everything()) %>%
mutate(type = ifelse(
name == "phi_induced",
"Induced", "Before Melding")) %>%
mutate(name = ifelse(name == "phi_induced", "P_A_testpos", name)) %>%
bind_rows(pi_samp_long) %>%
mutate(name = case_when(
name == "alpha" ~"$\\alpha$",
name == "beta" ~"$\\beta$",
name == "P_A_testpos" ~ "$P(S_0|test+,untested)$",
name == "P_S_untested" ~ "$P(S_1|untested)$")
) %>%
mutate(name = factor(name,
levels = c(
"$\\alpha$",
"$\\beta$",
"$P(S_1|untested)$",
"$P(S_0|test+,untested)$")))
library(cowplot)
p <- compare_melded %>%
ggplot(aes(x = value, fill = type)) +
geom_density(alpha = .5) +
facet_wrap(~name,
labeller = as_labeller(
TeX,
default = label_parsed), ncol = 3) +
  theme_bw() +
  theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title = element_text(size = 18),
axis.text.x = element_text(size = 12),
plot.title =element_text(size = 25, margin =margin(.5,.5,.5,.5)),
strip.text = element_text(size = 16),
legend.text = element_text(size = 16)) +
labs(
fill = "",
y = "Density") +
scale_fill_manual(values = c("#5670BF", "#418F6A", "#B28542"))
p1 <- p %+% subset(compare_melded,
name %in% c( "$\\alpha$",
"$\\beta$",
"$P(S_1|untested)$")) %+%
theme(legend.position = "none")
p2 <- p %+% subset(compare_melded,
name == "$P(S_0|test+,untested)$")
legend <- get_legend(
p1
)
cowplot::plot_grid(
p1,
p2,
nrow = 2
)
# ggsave("./img/melded.png", dpi = 700)
More on the Intuition for the Weights
Formal Definition of Logarithmic Pooling
As outlined in Carvalho et al., we can formally define logarithmic pooling as follows [@carvalho2023].
If we have a set of densities \(\{ f_1(\phi), f_2(\phi), \ldots, f_n(\phi)\}\) and corresponding pooling weights \(\boldsymbol{\alpha}=\{\alpha_1, \alpha_2, \ldots, \alpha_n\}\), then the pooled density is
\[ t(\boldsymbol{\alpha}) \prod_{i=1}^n f_i(\phi)^{\alpha_i}\] where \(t(\boldsymbol{\alpha})\) is the normalizing constant \(t(\boldsymbol{\alpha}) = \dfrac{1}{ \int_{\Phi}\prod_{i=1}^n f_i(\phi)^{\alpha_i} d\phi}\), ensuring the pooled density is a valid probability density.
The case in this work is simpler: we pool only two densities, \(f_\phi^{induced}\) and \(f_\phi^{direct}\), and we assign them equal weights by letting \(\boldsymbol{\alpha} = \{0.5, 0.5\}\). This yields
\[f^{pooled}(\phi) = \left( f^{induced} (\phi) \right)^{0.5} \left( f^{direct} (\phi) \right)^{0.5}.\]
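A quick numeric check of this formula (a standard result, verified on a grid here): equal-weight logarithmic pooling of two normal densities with common variance gives, after renormalization, a normal density centered at the average of the two means.

```r
# grid check: pool N(0, 1) and N(2, 1) with alpha = 0.5
dx <- 1e-3
x  <- seq(-8, 10, by = dx)
pooled <- sqrt(dnorm(x, mean = 0, sd = 1) * dnorm(x, mean = 2, sd = 1))
pooled <- pooled / sum(pooled * dx)   # renormalize numerically
target <- dnorm(x, mean = 1, sd = 1)  # N((0 + 2)/2, 1)
max(abs(pooled - target))             # effectively zero
```

This also illustrates why the renormalizing constant matters: the unnormalized geometric mean of two densities generally integrates to less than one.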
Theorem 1: Distribution After Resampling with Sampling Weights
One formulation of this problem is as follows.
Suppose we have a random variable \(X\) with PDF \(f_X\) and are sampling with weights \(h(x)\) for a nonnegative function \(h\). We are interested in determining the distribution after sampling.
Letting \(U\) denote the event that \(X\) was sampled, the probability that a given value of \(X\) is sampled is \(P(U=1|X=x) = h(x)\). We want the post-sampling distribution, that is, \(P(X \leq z | U = 1)\).
By the definition of conditional probability we have \(P(X \leq z | U = 1) = \dfrac{P(X \leq z , U = 1)}{ P(U=1)}\),
and since we will always be sampling at least some values, \(P(U=1)\) will always be a nonzero constant.
This gives us
\[P(X \leq z | U = 1) = \dfrac{1}{ P(U=1)} P(X \leq z , U = 1) \tag{1}.\] By the law of total probability, the joint probability can be written as the integral \[P(X \leq z, U = 1) = \int_{-\infty}^z P(U=1|X=x) \; f_X(x) \; dx.\]
Substituting this expression of the joint probability into (1), we have
\[P(X \leq z | U = 1) = \dfrac{1}{ P(U=1)} \int_{-\infty}^z P(U=1|X=x) \; f_X(x) \; dx.\]
As defined, \(h(x) = P(U=1|X=x)\), so we have
\[P(X \leq z | U = 1) = \dfrac{1}{ P(U=1)} \int_{-\infty}^z h(x) \; f_X(x) \; dx \propto \int_{-\infty}^z h(x) \; f_X(x) \; dx.\]
Demonstrative Examples
The most trivial case is \(h(x) = 1\): we sample all values of \(X\) with equal weight and recover the original distribution.
However, there are less trivial examples where \(P(X \leq z | U = 1)\) is in the form of a recognizable distribution.
Example 1:
For example, suppose \(X \sim Exp(\lambda)\), so \(f_X(x) = \lambda e^{-\lambda x}\), and we sample with weights directly proportional to \(X\), that is, \(h(x) = x\).
Then \[P(X \leq z | U = 1) \propto \int_0^z h(x) \; f_X(x) \; dx = \int_0^z x \cdot \lambda e^{-\lambda x} dx,\] and from the PDF of the gamma distribution, \(f(x) = \dfrac{\beta^\alpha}{\Gamma(\alpha) }x^{\alpha - 1} e^{-\beta x}\), we recognize that \(x \cdot e^{-\lambda x}\) is the kernel of the gamma distribution with \(\alpha = 2\) and \(\beta = \lambda\).
We can see this result by considering \(X\) before and after sampling below.
library(latex2exp)
library(viridis)
nsamp <- 1e6
##########################################
# EXAMPLE 1
##########################################
# weights are proportional to random variable itself
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=phi_sim)
dat <- tibble(`Before Sampling` = phi_sim,
`After Sampling` = phi_sim[post_samp_ind]) %>%
pivot_longer(cols = everything()) %>%
mutate(name = factor(name,
levels = c("Before Sampling",
"After Sampling")))
##########################################
# PLOT *NOT* INCLUDING THEORETICAL DENSITY
##########################################
dat %>%
# filter(name == "post_melding") %>%
ggplot() +
geom_density(aes(x=value, fill = name),
alpha = .6,
color = NA) +
labs(title = TeX(
paste0("Sampling from $X \\sim Exp(\\lambda = ",
input_rate, ")$ with Weights $h(x) =x$")),
# subtitle =TeX(paste0("PDF of Gamma$(2, \\lambda)$ in red")),
fill = "",
y= "Density") +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 18),
plot.subtitle = element_text(hjust = .5, size = 16),
legend.text = element_text(size = 18)) +
  scale_fill_manual(values = c("#0C2B67", "#DE8600"))
Then, we can see that the PDF of the gamma distribution with \(\alpha = 2\) and \(\beta = \lambda\) corresponds to the post-sampling distribution as expected.
library(ggtext)
##########################################
# PLOT INCLUDING THEORETICAL DENSITY
##########################################
color_lab <- paste0("PDF of Gamma(2, ", input_rate, ")")
dat %>%
ggplot() +
geom_density(aes(x=value, fill = name),
alpha = .6,
color = NA) +
labs(title = TeX(
paste0("Sampling from $X \\sim Exp(\\lambda = ",
input_rate, ")$ with Weights $h(x) =x$")),
subtitle =(paste0("PDF of Gamma(2, ",
input_rate,
") in <span style = 'color:darkred'>Red</span>")),
fill = "",
y = "Density") +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 18),
plot.subtitle = element_markdown(hjust = .5, size = 16),
legend.text = element_text(size = 18)) +
stat_function(fun = dgamma,
args=list(shape=2,
scale=1/input_rate),
aes(color = color_lab),
size = 1.2) +
scale_color_manual(name = "",
values = c("darkred")) +
scale_fill_manual(values = c("#0C2B67", "#DE8600")) +
guides(color = guide_legend(
    override.aes = list(size = 4)))
Example 2:
Similarly, suppose again that \(X \sim Exp(\lambda)\), so \(f_X(x) = \lambda e^{-\lambda x}\), but now we sample with weights defined by \(h(x)= e^{-\lambda x}\). Then \[P(X \leq z | U = 1) \propto \int_0^z h(x) \; f_X(x) \; dx = \int_0^z e^{-\lambda x} \cdot \lambda e^{-\lambda x} dx = \int_0^z \lambda e^{-2 \lambda x} dx,\] whose integrand is proportional to the density of the exponential distribution with rate \(2\lambda\).
Plotting \(X\) before and after sampling, we have:
################################################################
# EXAMPLE 2
################################################################
# weights are proportional to exp(-rate * random_variable)
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=exp(phi_sim*-1*input_rate))
dat <- tibble(`Before Sampling` = phi_sim,
`After Sampling` = phi_sim[post_samp_ind]) %>%
pivot_longer(cols = everything()) %>%
mutate(name = factor(name,
levels = c("Before Sampling",
"After Sampling")))
library(latex2exp)
##########################################
# PLOT *NOT* INCLUDING THEORETICAL DENSITY
##########################################
dat %>%
ggplot() +
geom_density(aes(x=value, fill = name),
alpha = .6,
color = NA) +
# stat_function(fun = dexp,
# args=list(rate = 2*input_rate),
# color = "red") +
labs(title = TeX(
paste0("Sampling from $X \\sim Exp(\\lambda=", input_rate, ") $ with Weights ",
"$e^{-\\lambda \\; x}$")),
#subtitle =TeX("PDF of $Exp(2*\\lambda)$ in Red"),
fill = "",
y = "Density") +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 18),
plot.subtitle = element_text(hjust = .5, size = 16),
legend.text = element_text(size = 18)) +
scale_fill_manual(values = c("#0C2B67", "#DE8600"))
Then, plotting the PDF of the exponential distribution with parameter \(2\lambda\), we can see the correspondence to the post-sampling distribution.
color_lab <- paste0("PDF of Exp(2 * ", input_rate, ")")
##########################################
# PLOT INCLUDING THEORETICAL DENSITY
##########################################
dat %>%
ggplot() +
geom_density(aes(x=value, fill = name),
alpha = .6,
color = NA) +
stat_function(fun = dexp,
args=list(rate = 2*input_rate),
# color = "darkred",
size=1.2,
aes(color = color_lab)) +
labs(title = TeX(
paste0("Sampling from $X \\sim Exp(\\lambda=",
input_rate,
") $ with Weights ",
"$e^{-\\lambda \\; x}$")),
subtitle =TeX("PDF of $Exp(2*\\lambda)$ in Red"),
fill = "",
y = "Density") +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 18),
plot.subtitle = element_text(hjust = .5, size = 16),
legend.text = element_text(size = 18)) +
scale_fill_manual(values = c("#0C2B67", "#DE8600")) +
scale_color_manual(name = "",
values = c("darkred")) +
guides(color = guide_legend(
override.aes = list(size = 4)))
Justification for Melding Implementation
In the implementation of melding in this work, \(\phi = \{ \phi_1, \ldots, \phi_n \}\) is a vector containing a sample where \(\phi_i \sim f_{\phi}^{induced}\).
Considering the \(i^{th}\) element \(\phi_i\), its density is \(f_{\phi}^{induced}(\phi_i)\).
With sample.int, we sample the index \(i\) with weight \(\left(\dfrac{f_\phi^{direct}(\phi_i)}{f_{\phi}^{induced}(\phi_i)}\right)^{0.5}\).
That is, \(h(\phi_i) =
\left(\dfrac{f_\phi^{direct}(\phi_i)}{f_{\phi}^{induced}(\phi_i)}\right)^{0.5}\).
We want to characterize the distribution of \(\phi\) after resampling with these weights, denoted \(f_\phi^{post}\).
By Theorem 1, the density after resampling satisfies \[f_\phi^{post} (\phi_i) \propto h(\phi_i) \; f_{\phi}^{induced}(\phi_i),\] and plugging in our expression for \(h\) we have
\[f_\phi^{post} (\phi_i) \propto \left(\dfrac{f_\phi^{direct}(\phi_i)}{f_{\phi}^{induced}(\phi_i)}\right)^{0.5} \; f_{\phi}^{induced}(\phi_i).\] Simplifying, we have
\[f_\phi^{post} (\phi_i) \propto \left({f_\phi^{direct}(\phi_i)}\right)^{0.5} \; \left( f_{\phi}^{induced}(\phi_i)\right)^{0.5}.\]
Hence, the density after resampling is the logarithmic pool of \(f_\phi^{direct}\) and \(f_\phi^{induced}\) with equal pooling weights.
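This result can be verified numerically in a case where the log pool has a closed form. The following base-R sketch (a toy example, not part of the melding implementation) takes \(f_\phi^{induced} = N(0,1)\) and \(f_\phi^{direct} = N(1,1)\); their equally weighted log pool is \(N(0.5, 1)\), so resampling with the weights above should approximately recover that distribution.

```r
set.seed(1)
n <- 2e5
# sample from the induced prior: phi_i ~ N(0, 1)
phi <- rnorm(n)
# melding weights (f_direct / f_induced)^0.5 with direct prior N(1, 1)
w <- (dnorm(phi, mean = 1) / dnorm(phi, mean = 0))^0.5
post <- phi[sample.int(n, size = n, replace = TRUE, prob = w)]
# the equally weighted log pool of N(0, 1) and N(1, 1) is N(0.5, 1)
c(mean(post), sd(post))
```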
library(latex2exp)
nsamp <- 1e5
######################################################
# EXAMPLE 1: ALL INDEXES GIVEN SAME WEIGHT
######################################################
# h(x) = 1
# suppose we know the distribution of phi
phi_sim <- rnorm(nsamp, mean =0, sd =1)
# tibble(x = runif(nsamp, 0, 1)) %>%
# ggplot(aes(x=x)) + geom_histogram()
# dunif(.5, 0, 1)
# simplest example, indexes all given equal weight (uniform 0 to 1)
post_samp_ind <-sample.int(n=nsamp,
size=nsamp,
prob=rep(1,nsamp),
replace=T)
tibble(pre_melding = phi_sim,
post_melding = phi_sim[post_samp_ind]) %>%
pivot_longer(cols = everything()) %>%
ggplot(aes(x=value, fill = name)) +
geom_density(alpha = .5)
# density of any point phi_i is dunif(ind_i, 0, 1) * dnorm(phi_1, 0, 1)
################################################################
# EXAMPLE 2
################################################################
# weights are proportional to random variable itself
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=phi_sim)
dat <- tibble(pre_sampling = phi_sim,
post_sampling = phi_sim[post_samp_ind]) %>%
pivot_longer(cols = everything())
dat %>%
# filter(name == "post_melding") %>%
ggplot() +
geom_density(aes(x=value, fill = name),
alpha = .5) +
stat_function(fun = dgamma, args=list(shape=2, scale=1/input_rate), color = "red")
################################################################
# EXAMPLE 3
################################################################
# weights are proportional to exp(-rate * random_variable)
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=exp(phi_sim*-1*input_rate))
dat <- tibble(pre_sampling = phi_sim,
post_sampling = phi_sim[post_samp_ind]) %>%
pivot_longer(cols = everything())
library(latex2exp)
dat %>%
ggplot() +
geom_density(aes(x=value, fill = name),
alpha = .5,
color = NA) +
stat_function(fun = dexp,
args=list(rate = 2*input_rate),
color = "red") +
labs(title = TeX(
paste0("Sampling from $X \\sim Exp(\\lambda) $ with Weights Proportional to ",
"$e^{-\\lambda \\; x}$ with $\\lambda$ = ", input_rate)),
subtitle =TeX("PDF of $Exp(2*\\lambda)$ in red"),
fill = "") +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 16),
plot.subtitle = element_text(hjust = .5, size = 16),
legend.text = element_text(size = 18))
################################################################
# EXAMPLE 4: INDEXES DISTRIBUTED BETA(shape1 = 12, shape2 = 12)
################################################################
phi_sim <- rnorm(nsamp, mean = 4, sd = 30)
tibble(x=phi_sim) %>%
ggplot(aes(x=x)) +
geom_density()
weights <- dbeta(seq(0,1, length = nsamp), shape1=12,shape2=12)
ggplot() +
stat_function(fun = dbeta, args =list(shape1=12,shape2=12))
post_samp_ind <-sample.int(n=nsamp,
size=nsamp,
prob=weights,
replace=T)
df <- tibble(x = phi_sim,
weights = weights,
dens = dnorm(phi_sim, mean = 4, sd = 30),
y = weights*dens) %>%
slice_sample(n = 5000)
dat <- tibble(pre_melding = phi_sim,
post_melding = phi_sim[post_samp_ind]) %>%
pivot_longer(cols = everything())
density(dat$post_melding, n=nsamp)
ggplot() +
geom_point(data=df,
aes(x=x, y=y ),
color = "red",
inherit.aes=FALSE)
dat %>%
# filter(name == "post_melding") %>%
ggplot() +
geom_point(data=df,
aes(x=x, y=y ),
color = "red",
inherit.aes=FALSE) +
geom_density(aes(x=value, fill = name),
alpha = .5)
# helpful
# https://stackoverflow.com/questions/59918865/what-happens-when-prob-argument-in-sample-sums-to-less-greater-than-1
Summary of the Implementation Procedure
In the implementation of Bayesian melding, we sample values of \(\theta = \{ \alpha, \beta, P(S_1|\text{untested})\}\). Then, we use the function \(M: \theta \to \phi\) defined by \(M(\theta) = \dfrac{\beta(1- P(S_1|\text{untested}))}{\beta(1-P(S_1|\text{untested})) + \alpha P(S_1|\text{untested})}\), yielding the induced distribution \(f_\phi^{induced}\). Because we do not have an analytical form for \(f_\phi^{induced}\), we must estimate it using some density estimation approach. Kernel density estimation is standard for this step.
Then, we compute the sampling weights \(\left(\dfrac{f_\phi^{direct}(\phi_i)}{f_{\phi}^{induced}(\phi_i)}\right)^{0.5}\). Here, \(f_\phi^{direct}(\phi_i)\) is calculated using the density function \(f_\phi^{direct}\), which is defined based on information from meta-analyses on the asymptomatic rate, and \(f_{\phi}^{induced}(\phi_i)\) is based on kernel density estimates. Sampling with these weights from our sample of \(M(\theta) = \phi\) and our sample of \(\theta\) yields the post-melding distributions, where the post-melding density of \(\phi\) is the logarithmic pool of \(f_\phi^{direct}\) and \(f_\phi^{induced}\), that is, \(f_\phi^{post} (\phi_i) \propto \left({f_\phi^{direct}(\phi_i)}\right)^{0.5} \; \left( f_{\phi}^{induced}(\phi_i)\right)^{0.5}.\)
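The full procedure can be sketched end-to-end in base R with toy stand-in priors. The Beta parameters below are illustrative assumptions only, not the priors used in this work; the structure (sample \(\theta\), apply \(M\), estimate the induced density, compute weights, resample) mirrors the steps just described.

```r
set.seed(42)
pre_nsamp <- 1e4
# toy priors on theta (illustrative stand-ins)
alpha <- rbeta(pre_nsamp, 9, 1)
beta <- rbeta(pre_nsamp, 2, 10)
P_S_untested <- rbeta(pre_nsamp, 1, 30)
# deterministic model M(theta) = phi
phi <- (beta * (1 - P_S_untested)) /
  (beta * (1 - P_S_untested) + alpha * P_S_untested)
# kernel density estimate of the induced prior, evaluated at each phi_i
kde <- density(phi, n = 2^12)
f_induced <- approx(kde$x, kde$y, xout = phi)$y
# direct prior on phi: Beta(4, 6), again an illustrative stand-in
f_direct <- dbeta(phi, 4, 6)
# melding weights and resampling
w <- (f_direct / f_induced)^0.5
post_ind <- sample.int(pre_nsamp, size = pre_nsamp, replace = TRUE, prob = w)
phi_post <- phi[post_ind]
```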
nsamp <- 1e6
theta <- tibble(alpha = rtrunc_alpha(nsamp),
beta= rtrunc_beta(nsamp),
P_S_untested = rtrunc_s1_untested(nsamp))
theta <- theta %>%
mutate(phi_induced = est_P_A_testpos(P_S_untested = P_S_untested,
alpha = alpha,
beta=beta))
phi <- theta$phi_induced
phi_induced_density <- density(x = phi, n = nsamp, adjust = 2, kernel = "gaussian")
############################
# MAP IMPLEMENTATION
############################
future::plan(multisession, workers = 3)
indexes <- furrr::future_map(phi, ~{
which(phi_induced_density$x > .x)[1]
}) %>%
unlist()
phi_sampled_density <- phi_induced_density$y[indexes]
weights <- (phi_sampled_density/ dp_s0_pos(phi))^(.5)
nsamp <- 1e6
post_samp_ind <-sample.int(n=nsamp,
size=nsamp,
prob=phi_sampled_density,
replace=T)
tibble(after = phi[post_samp_ind],
before = phi) %>%
pivot_longer(cols= everything()) %>%
ggplot(aes(x = value, fill = name)) +
geom_density()
# resample with the melding weights: prob = 1/weights = (f_direct/f_induced)^0.5
post_samp_ind <-sample.int(n=nsamp,
size=nsamp,
prob=1/weights,
replace=T)
pi_samp <- cbind(theta[post_samp_ind,],
P_A_testpos = phi[post_samp_ind]) %>%
select(-phi_induced)
More on Importance Sampling Resampling
Setting Pre and Post Number of Samples for Simple Example
nsamp <- 1e6
##########################################
# EXAMPLE 1
##########################################
nsamp <- 1e4
approx_target <- function(pre_nsamp, post_nsamp, input_rate) {
set.seed(123)
phi_sim <- rexp(pre_nsamp, rate = input_rate)
# weights proportional to random variable itself
set.seed(123)
post_samp_ind <- sample.int(n = pre_nsamp, size = post_nsamp, replace=TRUE, prob=phi_sim)
message(paste0("Post: ", length(post_samp_ind), ", Pre: ", length(phi_sim)))
tibble(name = "Before Sampling",
value = phi_sim) %>%
bind_rows(
tibble(name = "After Sampling",
value = phi_sim[post_samp_ind])
)
# tibble(`Before Sampling` = phi_sim,
# `After Sampling` = phi_sim[post_samp_ind]) %>%
# pivot_longer(cols = everything()) %>%
# mutate(name = factor(name,
# levels = c("Before Sampling",
# "After Sampling")))
}
compare_theoretical <- function(melded, input_rate, pre_nsamp, post_nsamp) {
color_lab <- paste0("PDF of Gamma(2, ", input_rate, ")")
melded %>%
ggplot() +
geom_density(aes(x=value, fill = name),
alpha = .6,
color = NA) +
labs(title = TeX(
paste0("Sampling from $X \\sim Exp(\\lambda = ",
input_rate, ")$ with Weights $h(x) =x$")),
subtitle =(paste0("PDF of Gamma(2, ",
input_rate,
") in <span style = 'color:darkred'>Red</span>",
", Sample Size: ", pre_nsamp,
", Posterior Sample Size:", post_nsamp)),
fill = "",
y = "Density") +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 18),
plot.subtitle = element_markdown(hjust = .5, size = 16),
legend.text = element_text(size = 18)) +
stat_function(fun = dgamma,
args=list(shape=2,
scale=1/input_rate),
aes(color = color_lab),
size = .9) +
scale_color_manual(name = "",
values = c("darkred")) +
scale_fill_manual(values = c("#0C2B67", "#DE8600")) +
guides(color = guide_legend(
override.aes = list(size = 4)))
}
rate <- .2
# melded <- approx_target(presamp = 1e5,
# postsamp = 1e4,
# input_rate = rate)
# compare_theoretical(melded, input_rate = rate)
walk(c(1e4, 5e4, 1e5, 1e6),~{
melded_df <- approx_target(pre_nsamp = .x,
post_nsamp = 1e4,
input_rate = rate)
plt <- compare_theoretical(melded_df, input_rate = rate,
pre_nsamp = .x,
post_nsamp = 1e4) +
xlim(0,55) +
ylim(0, .18)
print(plt)
})
walk(c(1e4, 5e4, 1e5, 1e6),~{
melded_df <- approx_target(pre_nsamp = .x,
post_nsamp = 1e4,
input_rate = rate) %>%
filter(name == "After Sampling")
plt <- compare_theoretical(melded_df, input_rate = rate,
pre_nsamp = .x,
post_nsamp = 1e4) +
xlim(0,55) +
ylim(0, .075)
print(plt)
})
# weights are proportional to random variable itself
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=phi_sim)
dat <- tibble(`Before Sampling` = phi_sim,
`After Sampling` = phi_sim[post_samp_ind]) %>%
pivot_longer(cols = everything()) %>%
mutate(name = factor(name,
levels = c("Before Sampling",
"After Sampling")))
##########################################
# PLOT *NOT* INCLUDING THEORETICAL DENSITY
##########################################
dat %>%
# filter(name == "post_melding") %>%
ggplot() +
geom_density(aes(x=value, fill = name),
alpha = .6,
color = NA) +
labs(title = TeX(
paste0("Sampling from $X \\sim Exp(\\lambda = ",
input_rate, ")$ with Weights $h(x) =x$")),
# subtitle =TeX(paste0("PDF of Gamma$(2, \\lambda)$ in red")),
fill = "",
y= "Density") +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 18),
plot.subtitle = element_text(hjust = .5, size = 16),
legend.text = element_text(size = 18)) +
scale_fill_manual(values = c("#0C2B67", "#DE8600"))
##########################################
# PLOT INCLUDING THEORETICAL DENSITY
##########################################
color_lab <- paste0("PDF of Gamma(2, ", input_rate, ")")
dat %>%
ggplot() +
geom_density(aes(x=value, fill = name),
alpha = .6,
color = NA) +
labs(title = TeX(
paste0("Sampling from $X \\sim Exp(\\lambda = ",
input_rate, ")$ with Weights $h(x) =x$")),
subtitle =(paste0("PDF of Gamma(2, ",
input_rate,
") in <span style = 'color:darkred'>Red</span>")),
fill = "",
y = "Density") +
theme_bw() +
theme(plot.title = element_text(hjust = .5, size = 18),
plot.subtitle = element_markdown(hjust = .5, size = 16),
legend.text = element_text(size = 18)) +
stat_function(fun = dgamma,
args=list(shape=2,
scale=1/input_rate),
aes(color = color_lab),
size = 1.2) +
scale_color_manual(name = "",
values = c("darkred")) +
scale_fill_manual(values = c("#0C2B67", "#DE8600")) +
guides(color = guide_legend(
override.aes = list(size = 4)))
Setting Pre and Post Number of Samples for COVID-19 Misclassification
###############################################################
# BETA PARAMETERS FROM DESIRED MEAN AND VARIANCE
###############################################################
get_beta_params <- function(mu, sd) {
var = sd^2
alpha <- ((1 - mu) / var - 1 / mu) * mu ^ 2
beta <- alpha * (1 / mu - 1)
return(params = list(alpha = alpha,
beta = beta))
}
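A quick check confirms that this method-of-moments inversion is exact (the function is restated below so the snippet is self-contained):

```r
# method-of-moments inversion for the Beta distribution, as in get_beta_params above
get_beta_params <- function(mu, sd) {
  var <- sd^2
  alpha <- ((1 - mu) / var - 1 / mu) * mu^2
  beta <- alpha * (1 / mu - 1)
  list(alpha = alpha, beta = beta)
}
p <- get_beta_params(mu = 0.15, sd = 0.09)
# recover the moments: mean = a/(a+b), var = ab / ((a+b)^2 (a+b+1))
m <- p$alpha / (p$alpha + p$beta)
s <- sqrt(p$alpha * p$beta / ((p$alpha + p$beta)^2 * (p$alpha + p$beta + 1)))
c(m, s)  # 0.15 and 0.09, up to floating point
```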
###############################################################
# BETA DENSITY WITH DESIRED MEAN AND VARIANCE
###############################################################
beta_density <- function(x, mean, sd, bounds=NA) {
shape_params <- get_beta_params(
mu = mean,
sd = sd)
if(!length(bounds) == 1){
# message("here")
dtrunc(x,
spec = "beta",
a = bounds[1],
b = bounds[2],
shape1 = shape_params$alpha,
shape2 = shape_params$beta) %>%
return()
}else{
dbeta(x,
shape1 = shape_params$alpha,
shape2 = shape_params$beta) %>%
return()
}
}
###############################################################
# SAMPLE FROM BETA DENSITY WITH DESIRED MEAN AND VARIANCE
###############################################################
sample_beta_density <- function(n, mean, sd, bounds = NA) {
shape_params <- get_beta_params(
mu = mean,
sd = sd)
if(!length(bounds) == 1){
# message("here")
rtrunc(n,
spec = "beta",
a = bounds[1],
b = bounds[2],
shape1 = shape_params$alpha,
shape2 = shape_params$beta) %>%
return()
}else{
rbeta(n,
shape1 = shape_params$alpha,
shape2 = shape_params$beta) %>%
return()
}
}
###############################################################
# GAMMA PARAMETERS FROM DESIRED MEAN AND VARIANCE
###############################################################
get_gamma_params <- function(mu, sd) {
shape = (mu/sd)^2
scale = sd^2/mu
return(params = list(shape = shape,
scale = scale))
}
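The analogous check for the Gamma moment inversion, restated compactly so the snippet is self-contained:

```r
# moment inversion for the Gamma distribution, as in get_gamma_params above
get_gamma_params <- function(mu, sd) {
  list(shape = (mu / sd)^2, scale = sd^2 / mu)
}
p <- get_gamma_params(mu = 0.95, sd = 0.08)
m <- p$shape * p$scale        # Gamma mean = shape * scale
s <- sqrt(p$shape) * p$scale  # Gamma sd = sqrt(shape) * scale
c(m, s)  # 0.95 and 0.08, up to floating point
```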
###############################################################
# GAMMA DENSITY WITH DESIRED MEAN AND VARIANCE
###############################################################
gamma_density <- function(x, mean, sd, bounds=NA) {
shape_params <- get_gamma_params(
mu = mean,
sd = sd)
if(!length(bounds) == 1){
#message("here")
dtrunc(x,
spec = "gamma",
a = bounds[1],
b = bounds[2],
shape = shape_params$shape,
scale = shape_params$scale) %>%
return()
}else{
dgamma(x,
shape = shape_params$shape,
scale = shape_params$scale) %>%
return()
}
}
sample_gamma_density <- function(n, mean, sd, bounds = NA) {
shape_params <- get_gamma_params(
mu = mean,
sd = sd)
if(!length(bounds) == 1){
#message("here")
rtrunc(n,
spec = "gamma",
a = bounds[1],
b = bounds[2],
shape = shape_params$shape,
scale = shape_params$scale) %>%
return()
}else{
rgamma(n,
shape = shape_params$shape,
scale = shape_params$scale) %>%
return()
}
}
###############################################################
# INDUCED PRIOR ON ASYMPTOMATIC RATE P(S_0|test+,untested)
###############################################################
# input sampled values of theta and compute M(\theta)
est_P_A_testpos = function(P_S_untested, alpha, beta){
(beta * (1 - P_S_untested)) / (( beta * (1 - P_S_untested)) + (alpha * P_S_untested))
}
##########################################
# County Correction Functions -----------
##########################################
#' Add sensitivity and specificity
process_priors_per_county <- function(priors, df){
dist_Se <- truncdist::rtrunc(n = 1e5,spec = "beta",a = 0.65,b = 1,
shape1 = get_beta_params(mu = 0.8,
sd = (0.4)^2)$alpha,
shape2 = get_beta_params(mu = 0.8,
sd = (0.4)^2)$beta)
dist_Sp <- truncdist::rtrunc(n = 1e5,spec = "beta",a = 0.9998,b = 1,
shape1 = get_beta_params(mu = 0.99995,
sd = (0.01)^2)$alpha,
shape2 = get_beta_params(mu = 0.99995,
sd = (0.01)^2)$beta)
priors_out <- priors %>%
mutate(
# calculate P(+|S_1, untested) and P(+|S_0, untested)
P_testpos_S = priors$Z_S * df$posrate,
P_testpos_A = priors$Z_A * df$posrate,
empirical_testpos = df$posrate,
population = df$population,
total = df$total,
positive = df$positive,
negative = df$negative) %>%
mutate(Se = dist_Se,
Sp = dist_Sp,
biweek = df$biweek,
fips = df$fips) %>%
# compute with constrained priors
mutate(P_A_testpos = est_P_A_testpos(
P_S_untested = priors$P_S_untested,
alpha = priors$Z_S,
beta = priors$Z_A))
return(priors_out)
}
#' correct for test inaccuracy
calc_A_star <- function(N, N_tested,
N_pos_obs,
P_testpos_est,
P_S_untested,
P_A_testpos,
Z_S,
Z_A,
Se,
Sp){
N_untested = N - N_tested
# number symptomatic and asymptomatic among tested
Npos_tested_S = N_pos_obs * (1 - P_A_testpos)
Npos_tested_A = N_pos_obs - Npos_tested_S
# prob testpos among untested
P_testpos_S = P_testpos_est * Z_S
P_testpos_A = P_testpos_est * Z_A
# estimate number of positives among untested
Npos_untested_S = P_S_untested * N_untested * P_testpos_S
Npos_untested_A = (1 - P_S_untested) * N_untested * P_testpos_A
A_star = Npos_tested_S + Npos_tested_A +
Npos_untested_S + Npos_untested_A
# correct for imperfect sensitivity and specificity
A = (A_star - ((1 - Sp) * N)) / (Se + Sp - 1)
return(max(A,0))
}
#' compute corrected counts for randomly sampled combination of parameters
correct_bias <- function(priors_by_county_df){
# N, N_tested, N_pos_obs, P_testpos_est,
# sample index to draw from distribution
sample_ind = sample(1:nrow(priors_by_county_df),
size = 1,
replace=TRUE)
# randomly sample from each distribution
sampled_priors = priors_by_county_df[sample_ind,]
# corrected case count
Astar = calc_A_star(N = sampled_priors$population,
N_tested =sampled_priors$total,
N_pos_obs = sampled_priors$positive,
P_testpos_est = sampled_priors$empirical_testpos,
P_S_untested = sampled_priors$P_S_untested,
P_A_testpos = sampled_priors$P_A_testpos,
Z_S = sampled_priors$Z_S,
Z_A = sampled_priors$Z_A,
Se = sampled_priors$Se,
Sp = sampled_priors$Sp
)
out = cbind(sampled_priors, exp_cases=Astar)
return(out)
}
#' generate distribution of corrected counts
generate_corrected_sample = function(priors_by_county_df,
num_reps){
# Obtain corrected case estimates
reps = num_reps
# need to set seed here to ensure that the same random draws are
# used for a given time period / location with same priors
set.seed(123)
# perform probabilistic bias analysis
result = replicate(reps, correct_bias(priors_by_county_df ), simplify=FALSE) %>%
bind_rows()
return(result)
}
#' summarize distribution of corrected counts
summarize_corrected_sample <- function(priors_by_county_df_exp_cases) {
summarized <- tibble(
exp_cases_median = median(priors_by_county_df_exp_cases$exp_cases),
exp_cases_lb = quantile(priors_by_county_df_exp_cases$exp_cases,
prob = 0.025) %>% unlist(),
exp_cases_ub = quantile(priors_by_county_df_exp_cases$exp_cases,
prob = 0.975) %>% unlist(),
biweek = unique(priors_by_county_df_exp_cases$biweek),
fips = unique(priors_by_county_df_exp_cases$fips),
empirical_testpos = unique(priors_by_county_df_exp_cases$empirical_testpos),
population = unique(priors_by_county_df_exp_cases$population),
negative = unique(priors_by_county_df_exp_cases$negative),
positive = unique(priors_by_county_df_exp_cases$positive),
total = unique(priors_by_county_df_exp_cases$total))
return(summarized)
}
get_corrected_counts <- function(county_df, melded_df) {
melded_df <- melded_df %>%
rename(Z_S = alpha,
Z_A = beta)
corrected <- pmap_df(county_df, ~ {
process_priors_per_county(
priors = melded_df,
df = list(...)) %>%
generate_corrected_sample(., num_reps = 1e3) %>%
summarize_corrected_sample() })
corrected %>%
left_join(dates)
}
get_melded <- function(alpha_mean = 0.9,
alpha_sd = 0.04,
alpha_bounds = NA,
beta_mean = .15,
beta_sd =.09,
beta_bounds = NA,
s_untested_mean = .025,
s_untested_sd = .0225,
s_untested_bounds = NA,
p_s0_pos_mean = .4,
p_s0_pos_sd = .1225,
p_s0_pos_bounds = NA,
nsamp = 1e3) {
given_args <- as.list(environment())
# cat("Arguments to get_melded:\n")
# print(given_args)
theta <- tibble(alpha = sample_gamma_density(nsamp,
mean = alpha_mean,
sd = alpha_sd,
bounds = alpha_bounds),
beta= sample_beta_density(nsamp,
mean = beta_mean,
sd = beta_sd,
bounds = beta_bounds),
P_S_untested = sample_beta_density(nsamp,
mean = s_untested_mean,
sd = s_untested_sd,
bounds = s_untested_bounds)) %>%
mutate(phi_induced = est_P_A_testpos(P_S_untested = P_S_untested,
alpha = alpha,
beta=beta))
# theta contains values sampled from alpha, beta, P_S_untested, and M(theta) = phi_induced
# induced phi
phi <- theta$phi_induced
# approximate induced distribution via a density approximation
phi_induced_density <- density(x = phi, n = nsamp, adjust = 2, kernel = "gaussian")
indexes <- findInterval(phi, phi_induced_density$x)
phi_sampled_density <- phi_induced_density$y[indexes]
dp_s0_pos <- function(x) {
beta_density(x,
mean=p_s0_pos_mean,
sd = p_s0_pos_sd,
bounds=p_s0_pos_bounds)
}
# note: sampling below with prob = 1/weights applies the melding weight (f_direct/f_induced)^0.5
weights <- (phi_sampled_density/ dp_s0_pos(phi))^(.5)
post_samp_ind <-sample.int(n=nsamp,
size=nsamp,
prob=1/weights,
replace=TRUE)
pi_samp <- cbind(theta[post_samp_ind,],
P_A_testpos = phi[post_samp_ind]) %>%
select(-phi_induced)
return(list(post_melding = pi_samp, pre_melding = theta))
}
#' reformat for plot generation
reformat_melded <- function(melded_df,
theta_df,
nsamp,
p_s0_pos_mean,
p_s0_pos_sd,
p_s0_pos_bounds) {
melded_df_long <- melded_df %>%
pivot_longer(cols=everything()) %>%
mutate(type = "After Melding")
melded <- theta_df %>%
mutate(P_A_testpos = sample_beta_density(nsamp,
mean = p_s0_pos_mean,
sd = p_s0_pos_sd,
bounds = p_s0_pos_bounds)) %>%
pivot_longer(cols=everything()) %>%
mutate(type = ifelse(
name == "phi_induced",
"Induced", "Before Melding")) %>%
mutate(name = ifelse(name == "phi_induced",
"P_A_testpos",
name)) %>%
bind_rows(melded_df_long) %>%
mutate(name = case_when(
name == "alpha" ~"$\\alpha$",
name == "beta" ~"$\\beta$",
name == "P_A_testpos" ~ "$P(S_0|test+,untested)$",
name == "P_S_untested" ~ "$P(S_1|untested)$")
) %>%
mutate(name = factor(name,
levels = c(
"$\\alpha$",
"$\\beta$",
"$P(S_1|untested)$",
"$P(S_0|test+,untested)$")))
}
plot_melded <- function(melded, custom_title="", nsamp) {
p1 <- melded %>%
filter(name != "$P(S_0|test+,untested)$") %>%
ggplot(aes(x = value, fill = type)) +
geom_density(alpha = .5, show.legend=FALSE) +
facet_wrap(~name,
labeller = as_labeller(
TeX, default = label_parsed),
ncol = 3,
scales = "fixed") +
theme_bw() +
theme(
# axis.text.y = element_blank(),
# axis.ticks.y = element_blank(),
axis.title = element_text(size = 18),
axis.text.x = element_text(size = 10),
plot.title =element_text(size = 18,
margin =margin(0,0, .5,0, 'cm')),
strip.text = element_text(size = 16),
legend.text = element_text(size = 16)) +
labs(title = TeX(custom_title,bold=TRUE),
subtitle =paste0("Number of Samples: ", nsamp),
fill = "",
y = "Density") +
scale_fill_manual(values = c("#5670BF", "#418F6A","#B28542")) +
guides(fill = guide_legend(keyheight = 2, keywidth = 2))
p2 <- melded %>%
filter(name == "$P(S_0|test+,untested)$") %>%
ggplot(aes(x = value, fill = type)) +
geom_density(alpha = .5) +
facet_wrap(~name,
labeller = as_labeller(
TeX, default = label_parsed),
ncol = 3,
scales = "fixed") +
theme_bw() +
theme(
# axis.text.y = element_blank(),
# axis.ticks.y = element_blank(),
axis.title = element_text(size = 18),
axis.text.x = element_text(size = 10),
plot.title =element_text(size = 18,
margin =margin(0,0, .5,0, 'cm')),
strip.text = element_text(size = 16),
legend.text = element_text(size = 18)) +
labs(
#title = paste0("Number of Samples: ", nsamp),
fill = "",
y = "Density") +
scale_fill_manual(values = c("#5670BF", "#418F6A","#B28542")) +
guides(fill = guide_legend(keyheight = 2, keywidth = 2)) +
xlim(0,1)
p1 / p2 + plot_layout(nrow =2, widths = c(4,1))
}
plot_corrected <- function(corrected_df) {
corrected_original %>%
bind_rows(corrected_df) %>%
ggplot(aes(x = date,
ymin = exp_cases_lb,
ymax= exp_cases_ub,
fill = version)) +
geom_ribbon(alpha = .6) +
labs(x = "Date",
y = "Estimated Total Cases\nfor 2-week Interval",
fill = "",
title = "Corrected Estimates for Suffolk County") +
theme_bw() +
theme(axis.text.x = element_text(angle = 30,
vjust = .5,
size = 12),
axis.title = element_text(size = 16,
face = "bold"),
legend.text = element_text(size = 16),
legend.position = "right",
plot.title = element_text(face = "bold",
hjust = .5,
size =22)) +
scale_y_continuous(labels = scales::comma) +
scale_x_date(date_breaks = "2 months", date_labels = "%b %Y") +
scale_fill_manual(values = c("#C94136", "#4297F8")) +
guides(fill = guide_legend(keyheight = 2, keywidth = 2))
}
# set prior parameters
prior_params <- list(
alpha_mean = .95,
alpha_sd = 0.08,
alpha_bounds = NA,
# alpha_bounds = c(.8,1),
beta_mean = .15,
beta_sd =.09,
beta_bounds = NA,
# beta_bounds = c(0.002, 0.4),
s_untested_mean = .03,
s_untested_sd = .0225,
# s_untested_bounds = c(0.0018, Inf),
s_untested_bounds = NA,
p_s0_pos_mean = .4,
p_s0_pos_sd = .1225,
p_s0_pos_bounds = NA,
# p_s0_pos_bounds = c(.25, .7),
pre_nsamp = 1e6,
post_nsamp = 1e5)
# needed for plot_melded
library(patchwork)
get_melded <- function(alpha_mean = 0.9,
alpha_sd = 0.04,
alpha_bounds = NA,
beta_mean = .15,
beta_sd =.09,
beta_bounds = NA,
s_untested_mean = .025,
s_untested_sd = .0225,
s_untested_bounds = NA,
p_s0_pos_mean = .4,
p_s0_pos_sd = .1225,
p_s0_pos_bounds = NA,
pre_nsamp = 1e6,
post_nsamp = 1e5) {
given_args <- as.list(environment())
cat("Arguments to get_melded:\n")
print(given_args)
theta <- tibble(alpha = sample_gamma_density(pre_nsamp,
mean = alpha_mean,
sd = alpha_sd,
bounds = alpha_bounds),
beta= sample_beta_density(pre_nsamp,
mean = beta_mean,
sd = beta_sd,
bounds = beta_bounds),
P_S_untested = sample_beta_density(pre_nsamp,
mean = s_untested_mean,
sd = s_untested_sd,
bounds = s_untested_bounds)) %>%
mutate(phi_induced = est_P_A_testpos(P_S_untested = P_S_untested,
alpha = alpha,
beta=beta))
# message(paste0("nrows of theta: ", nrow(theta)))
# theta contains values sampled from alpha, beta, P_S_untested, and M(theta) = phi_induced
# induced phi
phi <- theta$phi_induced
# approximate induced distribution via a density approximation
phi_induced_density <- density(x = phi, n = pre_nsamp, adjust = 2, kernel = "gaussian")
indexes <- findInterval(phi, phi_induced_density$x)
phi_sampled_density <- phi_induced_density$y[indexes]
dp_s0_pos <- function(x) {
beta_density(x,
mean=p_s0_pos_mean,
sd = p_s0_pos_sd,
bounds=p_s0_pos_bounds)
}
# note: sampling below with prob = 1/weights applies the melding weight (f_direct/f_induced)^0.5
weights <- (phi_sampled_density/ dp_s0_pos(phi))^(.5)
post_samp_ind <-sample.int(n=pre_nsamp,
size=post_nsamp,
prob=1/weights,
replace=TRUE)
post_melding <- bind_cols(theta[post_samp_ind,],
P_A_testpos = phi[post_samp_ind]) %>%
select(-phi_induced)
return(list(post_melding = post_melding, pre_melding = theta))
# return(post_melding)
}
#' reformat for plot generation
reformat_melded <- function(melded_df,
theta_df,
pre_nsamp,
p_s0_pos_mean,
p_s0_pos_sd,
p_s0_pos_bounds) {
melded_df_long <- melded_df %>%
pivot_longer(cols=everything()) %>%
mutate(type = "After Melding")
melded <- theta_df %>%
mutate(P_A_testpos = sample_beta_density(pre_nsamp,
mean = p_s0_pos_mean,
sd = p_s0_pos_sd,
bounds = p_s0_pos_bounds)) %>%
pivot_longer(cols=everything()) %>%
mutate(type = ifelse(
name == "phi_induced",
"Induced", "Before Melding")) %>%
mutate(name = ifelse(name == "phi_induced",
"P_A_testpos",
name)) %>%
bind_rows(melded_df_long) %>%
mutate(name = case_when(
name == "alpha" ~"$\\alpha$",
name == "beta" ~"$\\beta$",
name == "P_A_testpos" ~ "$P(S_0|test+,untested)$",
name == "P_S_untested" ~ "$P(S_1|untested)$")
) %>%
mutate(name = factor(name,
levels = c(
"$\\alpha$",
"$\\beta$",
"$P(S_1|untested)$",
"$P(S_0|test+,untested)$")))
}
plot_melded <- function(melded, custom_title="", nsamp) {
p1 <- melded %>%
filter(name != "$P(S_0|test+,untested)$") %>%
ggplot(aes(x = value, fill = type)) +
geom_density(alpha = .5, show.legend=FALSE) +
facet_wrap(~name,
labeller = as_labeller(
TeX, default = label_parsed),
ncol = 3,
scales = "fixed") +
theme_bw() +
theme(
# axis.text.y = element_blank(),
# axis.ticks.y = element_blank(),
axis.title = element_text(size = 18),
axis.text.x = element_text(size = 10),
plot.title =element_text(size = 18,
margin =margin(0,0, .5,0, 'cm')),
strip.text = element_text(size = 16),
legend.text = element_text(size = 16)) +
labs(title = TeX(custom_title,bold=TRUE),
subtitle =paste0("Number of Samples: ", nsamp),
fill = "",
y = "Density") +
scale_fill_manual(values = c("#5670BF", "#418F6A","#B28542")) +
guides(fill = guide_legend(keyheight = 2, keywidth = 2))
p2 <- melded %>%
filter(name == "$P(S_0|test+,untested)$") %>%
ggplot(aes(x = value, fill = type)) +
geom_density(alpha = .5) +
facet_wrap(~name,
labeller = as_labeller(
TeX, default = label_parsed),
ncol = 3,
scales = "fixed") +
theme_bw() +
theme(
# axis.text.y = element_blank(),
# axis.ticks.y = element_blank(),
axis.title = element_text(size = 18),
axis.text.x = element_text(size = 10),
plot.title =element_text(size = 18,
margin =margin(0,0, .5,0, 'cm')),
strip.text = element_text(size = 16),
legend.text = element_text(size = 18)) +
labs(
#title = paste0("Number of Samples: ", nsamp),
fill = "",
y = "Density") +
scale_fill_manual(values = c("#5670BF", "#418F6A","#B28542")) +
guides(fill = guide_legend(keyheight = 2, keywidth = 2)) +
xlim(0,1)
p1 / p2 + plot_layout(nrow =2, widths = c(4,1))
}
params <- prior_params
melded <- do.call(get_melded, params)
## Arguments to get_melded:
## $alpha_mean
## [1] 0.95
##
## $alpha_sd
## [1] 0.08
##
## $alpha_bounds
## [1] NA
##
## $beta_mean
## [1] 0.15
##
## $beta_sd
## [1] 0.09
##
## $beta_bounds
## [1] NA
##
## $s_untested_mean
## [1] 0.03
##
## $s_untested_sd
## [1] 0.0225
##
## $s_untested_bounds
## [1] NA
##
## $p_s0_pos_mean
## [1] 0.4
##
## $p_s0_pos_sd
## [1] 0.1225
##
## $p_s0_pos_bounds
## [1] NA
##
## $pre_nsamp
## [1] 1e+06
##
## $post_nsamp
## [1] 1e+05
melded_long <- reformat_melded(melded_df = melded$post_melding,
theta_df = melded$pre_melding,
p_s0_pos_mean = params$p_s0_pos_mean,
p_s0_pos_sd = params$p_s0_pos_sd,
p_s0_pos_bounds = params$p_s0_pos_bounds,
pre_nsamp = params$pre_nsamp)
plot_melded(melded_long, nsamp = params$post_nsamp)